-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Making UDS error handling and recovery more robust. #189
Conversation
Some notes:
|
}); | ||
} | ||
|
||
function trySetNewSocket(client) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm very open to a naming convention other try
in trySetNewSocket()
} | ||
} | ||
|
||
function maybeAddUDSErrorHandler(client, lastSocketCreateTime) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm very open to a naming convention other than maybe
in maybeAddUDSErrorHandler()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get too hung up on names, unless I think it doesn't fit the style. Not too much of a naming style in here so it's fine. :) Spent a bunch of time making sure I understood the new paths in here, for this function and others. I didn't see any issues, looks good to me.
@@ -526,3 +502,59 @@ Client.prototype.childClient = function (options) { | |||
|
|||
exports = module.exports = Client; | |||
exports.StatsD = Client; | |||
|
|||
function udsErrorHandler(client, lastSocketCreateTime) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I'm not sure if you want these handlers here at the end
- I'm not sure if you'd prefer
client
bethis
and we just put these helpers on theClient
prototype
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine as is. Great to see the nice docs there.
@@ -354,7 +321,16 @@ Client.prototype.sendMessage = function (message, callback) { | |||
return; | |||
} | |||
|
|||
if (!this.socket) { | |||
const socketWasMissing = !this.socket; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be considered a breaking change that I've added an attempt to recover here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, thanks. Not sure yet if this will be a minor or major release, but I think this part by itself would fit in ok with what has been minor releases.
@@ -48,6 +51,8 @@ const Client = function (host, port, prefix, suffix, globalize, cacheDns, mock, | |||
this.cacheDnsTtl = options.cacheDnsTtl || CACHE_DNS_TTL_DEFAULT; | |||
this.host = options.host || process.env.DD_AGENT_HOST; | |||
this.port = options.port || parseInt(process.env.DD_DOGSTATSD_PORT, 10) || 8125; | |||
this.path = options.path; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I've added four public fields to
Client
(path
,stream
,udsGracefulRestartRateLimit
andisChild
), this may not be a desirable addition to the "exported" interface - I haven't updated the TypeScript definition(s) to reflect new exports / fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine, and you shouldn't add these to TypeScript definitions, others ones aren't mentioned in there like this (I don't think)
To verify this change I've been running const process = require("process");
const hotShots = require("./index");
const DATADOG_ADDRESS = process.env.DATADOG_ADDRESS;
function toPath(datadogAddress) {
if (datadogAddress.startsWith("unix://")) {
// NOTE: The length of the "unix://" prefix is 7.
return datadogAddress.slice(7, datadogAddress.length);
}
return datadogAddress;
}
function createUDSClient() {
const opts = {
mock: false,
maxBufferSize: 0,
path: toPath(DATADOG_ADDRESS),
protocol: "uds",
udsGracefulErrorHandling: true,
udsGracefulRestartRateLimit: 1000,
};
return new hotShots.StatsD(opts);
}
function makeEventLoop(udsClient) {
return function compute() {
console.log(
JSON.stringify({
message: "Doing very big computation",
level: "info",
_timestamp: new Date(),
}),
);
udsClient.increment("computation.uds", 1, 1, []);
};
}
async function main() {
const udsClient = createUDSClient();
const compute = makeEventLoop(udsClient);
setInterval(compute, 500);
}
if (require.main === module) {
main();
} via
and then turning on and off the
|
@dhermes Thanks for the PR and all of the info! Neat to see local-dd-agent too. I see the tests are still failing in here, looks like something isn't going right on the Mac version. Also as noted in #188 looks like we have some overlap here, haven't had a chance yet myself to look over everything and understand. |
Yup; I haven't had a chance to parse the test failures but will attempt to tackle it next week. I didn't want to devote too much time building a cathedral before getting sign off from maintainers. Once / if you think this looks promising I can start to tackle the test failures? I'd love an assist if the Travis CI failures look "easy" to tackle for you (but by all means not needed). Also, since there are 4 distinct issues, I can send 4 different PRs if you'd rather (the commits themselves in this PR can be broken up nicely to fix each of the 4 issues separately). |
if (process.platform === 'linux') { | ||
return [ | ||
os.constants.errno.ENOTCONN, | ||
os.constants.errno.ECONNREFUSED, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, I didn't realize there was a constant here available on Node 8 and above. Good to know.
I took a pretty quick pass on this, but great to see all the work done and makes general sense to me. When the tests are passing (and I have a little more time) I'll take another look at this and see if I have any more feedback. This UDS handling has been looking ugly for awhile (from what I can gather from the bug reports), glad to see someone take it all on and make it work better. I may be able to help with the tests here at some point (do run a Mac here) if you need it, just been short on time recently. It's not clear to me right away what's going wrong in there- oddly enough it's a mix of all sorts of different tests failing in there (https://travis-ci.org/github/brightcove/hot-shots/jobs/730370815#L1518). |
…` in constructor. Also using these in places where `options` was used.
This way we can retry using this helper.
This is in advance of a change that will allow these to be OS specific.
See https://nodejs.org/docs/latest-v12.x/api/errors.html for an example (in `error.errno`) explaining why error codes are negative.
This way `this.socket` is only replaced if `createTransport()` returned an actual socket (vs. `null`).
See: https://travis-ci.org/github/brightcove/hot-shots/jobs/730354851 This was due to `eslint` failures. Also changed some `==` to `===` and added / removed some `;` in lines that needed / didn't need.
This "bug" was introduced to my bugfix in a re-factor.
@bdeitte Tests are green here so should be good to review. |
Awesome. I spent some time with tonight and don't see any issues. Thanks again for this! Going to release in a minor release here in a bit. |
Fixes #128.